Unification of XML Documents with Concurrent Markup
نویسندگان
چکیده
منابع مشابه
Unification of XML Documents with Concurrent Markup
Annotating multiple hierarchies with SGML-based markup systems is still one of the fundamental problems of text-technological research. Up to now, several solutions have been discussed (e.g. chapter 31 of the TEI-Guidelines (Sperberg-McQueen and Burnard 1994) and Barnard et al. (1995)). Furthermore, some non-SGML based approaches have been proposed. (cf. Huitfeldt and SperbergMcQueen (2001) ; T...
متن کاملQuerying XML documents with multi-dimensional markup
XML documents annotated by different NLP tools accommodate multidimensional markup in a single hierarchy. To query such documents one has to account for different possible nesting structures of the annotations and the original markup of a document. We propose an expressive pattern language with extended semantics of the sequence pattern, supporting negation, permutation and regular patterns tha...
متن کاملAutomating XML markup of text documents
We present a novel system for automatically marking up text documents into XML and discuss the benefits of XML markup for intelligent information retrieval. The system uses the Self-Organizing Map (SOM) algorithm to arrange XML marked-up documents on a twodimensional map so that similar documents appear closer to each other. It then employs an inductive learning algorithm C5 to automatically ex...
متن کاملXPath Extension for Querying Concurrent XML Markup∗
XPath is a language for addressing parts of an XML document. It is used in many XML query languages and it can be used by itself for querying XML documents. While XPath is, in general, efficient for querying individual XML documents, it lacks the features for querying over collections of documents or joining parts of the same document. As the amount of complex document-centric XML data is conti...
متن کاملInformation Extraction and Automatic Markup for XML Documents
As XML is going to become the standard document format, there is still the legacy problem of large amounts of text (written in the past as well as today) that are not available in this format. In order to exploit the benefits of XML, these legacy texts must be converted into XML. In this chapter, we discuss the issues of automatic XML markup of documents. We give a survey on existing approaches...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Literary and Linguistic Computing
سال: 2005
ISSN: 0268-1145,1477-4615
DOI: 10.1093/llc/fqh046